{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4. Independent components analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\ell(W) = \\sum_{i=1}^m \\left(\\log|W| +\\sum_{j=1}^n \\log g'(w_j^T x^{(i)})\\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (a) Gaussian source"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From $s_j \\sim \\mathcal N(0,1)$ we get \n",
    "$$\\begin{align*}\n",
    "\\sum_{i=1}^m \\log g'(w_j^Tx^{(i)})  &= \\sum_{i=1}^m\\log\\left(\\frac 1{\\sqrt{2 \\pi}}\\exp\\left(-\\frac 1 2 (w_j^Tx^{(i)})^2\\right)\\right)\\\\\n",
    "&= -m\\log \\sqrt{2 \\pi} - \\frac 1 2 \\sum_{i=1}^m w_j^Tx^{(i)}(x^{(i)})^Tw_j\\\\\n",
    "&= -m\\log \\sqrt{2 \\pi} - \\frac 1 2 w_j^TX^TXw_j\\\\\n",
    "&= -m\\log \\sqrt{2 \\pi} - \\frac 1 2 e_j^TWX^TXW^Te_j,\n",
    "\\end{align*}$$\n",
    "where $e_1,\\ldots, e_n$ is the standard basis of $\\mathbb R^n$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From this we find\n",
    "$$\\begin{align*}\n",
    "\\sum_{i=1}^m\\sum_{j=1}^n \\log g'(w_j^Tx^{(i)}) &=-mn\\log \\sqrt{2 \\pi} - \\frac 1 2 \\sum_{j=1}^n  e_j^TWX^TXW^Te_j\\\\\n",
    "&= -mn\\log \\sqrt{2 \\pi} -\\frac 1 2 \\mathrm{tr}(WX^TXW^T)\\\\\n",
    "&= -mn\\log \\sqrt{2 \\pi} -\\frac 1 2 \\mathrm{tr}(W^TWX^TX).\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Applying the derivative of the left hand side to a matrix $V$ therefore gives\n",
    "$$\\begin{align*}\n",
    "\\left(\\mathrm d_W\\sum_{i=1}^m\\sum_{j=1}^n \\log g'(w_j^Tx^{(i)})\\right) V\n",
    "&= -\\frac 1 2 \\mathrm{tr}(V^TWX^TX)-\\frac 1 2 \\mathrm{tr}(W^TVX^TX)\\\\\n",
    "&= - \\mathrm{tr}(V^TWX^TX)\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From which \n",
    "$$ \\nabla_W\\sum_{i=1}^m\\sum_{j=1}^n \\log g'(w_j^Tx^{(i)}) = -WX^TX$$\n",
    "follows."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After recalling that\n",
    "$$ \\nabla_W \\log |W| = (W^{-1})^T $$\n",
    "we therefore find"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "\\nabla_W \\ell(W) = m (W^{-1})^T - WX^TX.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Any $W$ that maximizes $\\ell(W)$ must hence satisfy the equation\n",
    "$$ W^TW = m (X^TX)^{-1}.$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If $W$ satisfies this equation, then so does $W':= PW$ for any orthogonal matrix $P$:\n",
    "$$ W'^T W' = W^TP^TPW = W^TIW = W^TW = m(X^TX)^{-1}.$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In fact the above calculations show that this rotational invariance is already present in the log likelihood, i.e.$\\ell(PW) = \\ell (W)$ holds for all matrices $W$ and all orthogonal matrices $P$ (because $|PW| = |P|\\cdot |W| = |W|$)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (b) Laplace source"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In general we have \n",
    "$$\\begin{align*}\n",
    "\\left(\\mathrm d_W \\log g'(w_j^Tx^{(i)}) \\right)V &= \\frac{g'(w_j^Tx^{(i)})}{g''(w_j^Tx^{(i)})} e_j^T V x^{(i)}\\\\\n",
    "&= \\frac{g'(w_j^Tx^{(i)})}{g''(w_j^Tx^{(i)})} \\mathrm{tr}\\left((x^{(i)})^T e_j^T V\\right)\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "and therefore\n",
    "$$\\begin{align*}\n",
    "\\nabla_W  \\log g'(w_j^Tx^{(i)}) &= \\frac{g'(w_j^Tx^{(i)})}{g''(w_j^Tx^{(i)})} \\left((x^{(i)})^T e_j^T\\right)^T\\\\\n",
    "&=\\frac{g'(w_j^Tx^{(i)})}{g''(w_j^Tx^{(i)})} e_j(x^{(i)})^T.\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we can write the gradient of the log-likelihood  (for a single example, i.e. $m=1$) as\n",
    "$$\\begin{align*}\n",
    "\\nabla_W \\ell(W) &= (W^T)^{-1} + \\sum_{j=1}^n\\frac{g'(w_j^Tx^{(i)})}{g''(w_j^Tx^{(i)})} e_j(x^{(i)})^T\\\\\n",
    "&= (W^T)^{-1} + \\begin{bmatrix} \n",
    "\\frac{g'(w_1^Tx^{(i)})}{g''(w_1^Tx^{(i)})}\\\\\n",
    "\\vdots\\\\\n",
    "\\frac{g'(w_n^Tx^{(i)})}{g''(w_n^Tx^{(i)})}\n",
    "\\end{bmatrix} (x^{(i)})^T.\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In particular for a Laplacian density $g'(s)=f_{\\mathcal L}(s) = \\frac 1 2 \\exp(-|s|)$ we get the gradient update rule\n",
    "$$\\nabla_W \\ell(W) = W +\\alpha\\left((W^T)^{-1} - \\begin{bmatrix} \n",
    "\\mathrm{sgn}(w_1^Tx^{(i)})\\\\\n",
    "\\vdots\\\\\n",
    "\\mathrm{sgn}(w_n^Tx^{(i)})\n",
    "\\end{bmatrix} (x^{(i)})^T\\right)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (c) Cocktail Party Problem"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.io.wavfile"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Implement the gradient ascent update:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def update_W(W, x, learning_rate):\n",
    "    \"\"\"\n",
    "    Perform a gradient ascent update on W using data element x and the provided learning rate.\n",
    "\n",
    "    This function should return the updated W.\n",
    "\n",
    "    Use the laplace distribiution in this problem.\n",
    "\n",
    "    Args:\n",
    "        W: The W matrix for ICA\n",
    "        x: A single data element\n",
    "        learning_rate: The learning rate to use\n",
    "\n",
    "    Returns:\n",
    "        The updated W\n",
    "    \"\"\"\n",
    "    \n",
    "    # *** START CODE HERE ***\n",
    "    W_T_inv = np.linalg.inv(W.T)\n",
    "    x = x.reshape((-1,1))\n",
    "    grad = W_T_inv - np.sign(W @ x) @ x.T\n",
    "    updated_W = W + learning_rate * grad\n",
    "    # *** END CODE HERE ***\n",
    "    \n",
    "    return updated_W"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compute the isolated components $s^{(i)} = Wx^{(i)}$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def unmix(X, W):\n",
    "    \"\"\"\n",
    "    Unmix an X matrix according to W using ICA.\n",
    "\n",
    "    Args:\n",
    "        X: The data matrix\n",
    "        W: The W for ICA\n",
    "\n",
    "    Returns:\n",
    "        A numpy array S containing the split data\n",
    "    \"\"\"\n",
    "\n",
    "    S = np.zeros(X.shape)\n",
    "\n",
    "\n",
    "    # *** START CODE HERE ***\n",
    "    S = X @ W.T\n",
    "    # *** END CODE HERE ***\n",
    "\n",
    "    return S"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Utility functions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "Fs = 11025\n",
    "\n",
    "def normalize(dat):\n",
    "    return 0.99 * dat / np.max(np.abs(dat))\n",
    "\n",
    "def load_data():\n",
    "    mix = np.loadtxt('data/mix.dat')\n",
    "    return mix\n",
    "\n",
    "def save_sound(audio, name):\n",
    "    scipy.io.wavfile.write('output/{}.wav'.format(name), Fs, audio)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define the learning algorithm with learning rate that decreases over time:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def unmixer(X):\n",
    "    M, N = X.shape\n",
    "    W = np.eye(N)\n",
    "\n",
    "    anneal = [0.1 , 0.1, 0.1, 0.05, 0.05, 0.05, 0.02, 0.02, 0.01 , 0.01, 0.005, 0.005, 0.002, 0.002, 0.001, 0.001]\n",
    "    print('Separating tracks ...')\n",
    "    for lr in anneal:\n",
    "        print(lr)\n",
    "        rand = np.random.permutation(range(M))\n",
    "        for i in rand:\n",
    "            x = X[i]\n",
    "            W = update_W(W, x, lr)\n",
    "\n",
    "    return W"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load the data and scale it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(53442, 5)\n"
     ]
    }
   ],
   "source": [
    "X = normalize(load_data())\n",
    "\n",
    "print(X.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Extract five sound files with the mixed sounds:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in range(X.shape[1]):\n",
    "    save_sound(X[:, i], 'mixed_{}'.format(i))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can find the files **mixed_0.wav, ..., mixed_4.wav**  in the output folder.\n",
    "\n",
    "***Note: If you have trouble playing the .wav files, try VLC player.*** "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can run independent components analysis to split the mixed sounds and save the results as **split_0.wav, ..., split_4.wav** in the output folder:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Separating tracks ...\n",
      "0.1\n",
      "0.1\n",
      "0.1\n",
      "0.05\n",
      "0.05\n",
      "0.05\n",
      "0.02\n",
      "0.02\n",
      "0.01\n",
      "0.01\n",
      "0.005\n",
      "0.005\n",
      "0.002\n",
      "0.002\n",
      "0.001\n",
      "0.001\n",
      "[[ 52.8323537   16.7959263   19.94470853 -10.19926121 -20.89890051]\n",
      " [ -9.92157731  -0.97003031  -4.64037237   8.0318867    1.77653597]\n",
      " [  8.3010851   -7.47767761  19.31146438  15.19230685 -14.33882204]\n",
      " [-14.65903801 -26.63185079   2.44381882  21.37311935  -8.41287637]\n",
      " [ -0.27478148  18.38479412   9.3139653    9.11002354  30.60715687]]\n"
     ]
    }
   ],
   "source": [
    "W = unmixer(X)\n",
    "print(W)\n",
    "S = normalize(unmix(X, W))\n",
    "for i in range(S.shape[1]):\n",
    "    save_sound(S[:, i], 'split_{}'.format(i))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
